"filter", v. 2.1 Latest mod: 12:30 Jun 8 1994
A `grep'-like text searcher for multiple simultaneous keyword tests
Copyright 1994 by Joel Polowin, Department of Chemistry, Queen's University,
Kingston, Ontario, Canada. Permission granted for free use and distribution;
I want credit/blame for writing it. E-mail: polowin@silicon.chem.queensu.ca,
polowinj@qucdn.queensu.ca, Joel.Polowin@p4.f107.n249.z1.fidonet.org .
If you see something wrong with it or it fails to work, PLEASE let me know!
Syntax: filter [filename] [filename ...] string [string ...]
where each string (default max. of 2000) is a term to be searched for
in lines (default max. 600 chars) in file(s) `filename', prefixed by one
of the following characters:
    +   to show lines which contain string
    -   to show lines which do not contain string
    =   to show lines which contain string, case sensitive
    _   (underscore) to show lines which do not contain string,
        case sensitive
A string as above may be further prefixed with the letter 'o' to print the
line if the current OR the preceding condition is true.
A string including blanks and the prefix may be enclosed in double quotes
on most systems. Your operating system may have other ways of dealing
with special characters.
"filter" determines the first string which is a search term instead of a
file name by its beginning with one of the characters `+-=_'. For files
whose names begin with one of these characters, see below. Otherwise, the
first search term must begin with one of these, as that first term cannot
be `or'-linked to a preceding term.
For strings that begin with `$', `&', or `^' (usually names of files of search
terms), see below.
Examples:
  filter * +hawk +handsaw o+hound
      searches all files in the current directory and prints lines that
      contain the string `hawk' and at least one of `handsaw' or `hound'.
      This assumes that the operating system and compiler accept wild-carded
      file names; else "filter" will be looking for a file named `*'. For
      DOS, one would use `*.*'.

  filter armorial =Vert +argent -gules _Or -azure -purpur +foil > tempfile.txt
      searches the file `armorial' for lines that contain the string `Vert'
      (case-sensitive) and `argent' (upper or lower case) but not `gules'
      (upper or lower case) and not `Or' (case-sensitive) and not `azure'
      or `purpur' (upper or lower case) and DO contain `foil' (upper or
      lower case); the resulting lines are saved in file `tempfile.txt'.

  type temp1.txt | filter +aardvark "o+winged pig" o+wombat "_|B|"
      The file `temp1.txt' is fed through the "filter" program, which passes
      lines that contain `aardvark' (upper or lower case) or the string
      `winged pig' or `wombat' (upper or lower case) and do NOT contain
      `|B|'. The result is printed on the screen. Note use of quotation
      marks in the command line to include the space in `winged pig' and the
      special character `|' in `|B|'.
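
The matching rules above can be sketched in Python. This is an illustration
only, not the program's own source. The function names are made up here, and
the grouping of or-linked terms (a run of `o'-prefixed terms is OR-ed with the
term before it, and the resulting groups are AND-ed) is an assumption inferred
from the three examples. Doubled flag characters and search-term files are
not handled.

```python
PREFIXES = "+-=_"


def parse_term(term):
    """Split a term such as 'o+hound' into (or_linked, prefix, text)."""
    or_linked = len(term) > 1 and term[0] == "o" and term[1] in PREFIXES
    if or_linked:
        term = term[1:]
    if not term or term[0] not in PREFIXES:
        raise ValueError("search term needs one of the prefixes " + PREFIXES)
    return or_linked, term[0], term[1:]


def term_matches(prefix, text, line):
    """Apply one prefix rule to one line of text."""
    if prefix == "+":                      # contains, any case
        return text.lower() in line.lower()
    if prefix == "-":                      # does not contain, any case
        return text.lower() not in line.lower()
    if prefix == "=":                      # contains, case sensitive
        return text in line
    return text not in line                # "_": does not contain, case sensitive


def line_passes(line, terms):
    """Assumed logic: OR within a run of or-linked terms, AND between runs."""
    groups = []
    for term in terms:
        or_linked, prefix, text = parse_term(term)
        result = term_matches(prefix, text, line)
        if or_linked and groups:
            groups[-1] = groups[-1] or result
        else:
            groups.append(result)
    return all(groups)
```

With the terms of the first example, a line mentioning both `hawk' and `hound'
passes, while a line mentioning only `hawk' does not.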
----------
File names beginning with one of `+-=_'
If you absolutely MUST use text file names that begin with one of these
characters, use the character twice when specifying the file name to
"filter". Thus, the file name `-stdev.c' would be written `--stdev.c';
`++junk.c' would be written `++++junk.c'.
What "filter" does is to go through each term in the command line and
count the identical flag characters that begin it. Half that number,
rounded down, is removed from the front of the term. An even count
specifies a text file name; an odd count designates a search term.
`+junk' has one flag character, is not changed (half of 1, rounded down,
is 0 characters removed), and is a search term: print lines containing
`junk'. `++junk' has two flag characters, is shortened to `+junk', and is
read as a text file name. `+++junk' has three, is shortened to `++junk',
and is a search term: print lines containing the string `+junk'. `+=junk'
has one flag character (the `=' is a different character and is not
counted) and is a search term: print lines containing the string `=junk'.
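
The counting rule can be sketched as follows. This is a Python illustration
of the rule as described, not the program's own code, and the function names
are hypothetical.

```python
FLAGS = "+-=_"


def leading_flags(arg):
    """Count the identical flag characters that begin an argument."""
    if not arg or arg[0] not in FLAGS:
        return 0
    return len(arg) - len(arg.lstrip(arg[0]))


def shorten(arg):
    """Remove half of the leading flag characters, rounded down."""
    return arg[leading_flags(arg) // 2:]


def is_search_term(arg):
    """An odd count marks a search term; an even count a text file name."""
    return leading_flags(arg) % 2 == 1


for arg in ["+junk", "++junk", "+++junk", "+=junk", "--stdev.c", "++++junk.c"]:
    kind = "search term" if is_search_term(arg) else "file name"
    print(arg, "->", shorten(arg), "(" + kind + ")")
```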
This means that wild-carded file names that match files whose names begin
with one of `+-=_' will cause trouble. I'm sorry; the telepathic monitors
of most computer systems are not software-addressable. A compulsive urge
to use files whose names begin with punctuation or mathematical symbols
can now be treated successfully in a majority of cases.
Search terms specified in files (see below) are not themselves in the
command line, and if they begin with one of `+-=_' those characters should
not be doubled. Search-term file expansion takes place after "filter"
determines which command-line strings are file names.
----------
`$', `&', and `^' usually flag search-term file names.
A string which begins with `$', `&', or `^' will be expanded as the name of a
file containing a list of search terms. For example, the string +$critters
tells "filter" to look for a file `critters'; lines from that file are taken
as search terms.
If the file name is specified with `^', terms INCLUDING `+-=_' prefixes are
read from the file. The string specifying the file name is replaced in the
command line by the list of terms read from the file; the original prefix
is ignored. "filter" does not add `+-=_' prefixes to the terms.
If the file name is specified with `$' or `&', "filter" adds the prefix to
each term, depending on which of `$&' is used and which of `+-=_' precedes it:
  `$' with `+' or `=':  terms are or-linked, so that text file lines will
                        be printed if any search-term file line is matched.
  `$' with `-' or `_':  terms are not or-linked, so that text file lines
                        are printed only if no search-term file line is
                        matched.
  `&' with `+' or `=':  terms are not or-linked, so that ALL search-term
                        lines must be matched to print a text line.
  `&' with `-' or `_':  terms ARE or-linked, so that any search-term line
                        NOT matched will allow text-line printing.
To search for actual text strings beginning with `$', `&', or `^', double the
flag characters. Thus, to search for the string `$100', use the search
string `+$$100'. To expand file names beginning with those characters,
use three of them: search-term file `$&junk' would be specified with something
like `-$$$&junk'. In general, when a search term begins with a flag
character, double each flag character of that kind beginning the term, and
if the term is a file name, add an extra flag character.
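
The doubling rule can be read as a small decoding step, sketched here in
Python. The function name and the return convention are hypothetical. The
input is assumed to be a search term with its `+-=_' prefix already removed.

```python
FILE_FLAGS = "$&^"


def decode(text):
    """Return (flag, value) for a term whose +-=_ prefix has been removed.

    flag is None for literal text, in which case value is the string to
    search for. Otherwise flag is one of $ & ^ and value is the name of
    the search-term file.
    """
    if not text or text[0] not in FILE_FLAGS:
        return (None, text)
    flag = text[0]
    count = len(text) - len(text.lstrip(flag))
    rest = text[count:]
    if count % 2 == 0:
        # Every doubled flag character stands for one literal character.
        return (None, flag * (count // 2) + rest)
    # Odd count: one character flags the file, the others are doubled.
    return (flag, flag * ((count - 1) // 2) + rest)


print(decode("$$100"))     # literal text $100
print(decode("$$$&junk"))  # file $&junk, flagged with $
```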
Search-term files may contain file names, which will be expanded in turn.
For this reason, initial `$', `&', and `^' characters must be doubled even in
nested search-term files.
Note that the or-linking logic can get seriously messed up when terms
beginning with `-' or `_' are expanded carelessly, as "filter" has no good
sense of logical precedence. If file `human' contains `man' and `woman',
then `o-$human' would expand as `o-man -woman': only `-man' is or-linked to
the preceding term, while `-woman' stands as a separate condition.
Examples:
  filter armorial +$animal o+$vegetable _^mineral

If file `animal' reads
      $human
      reptile
      amphibian
and file `human' reads
      man
      woman
and file `vegetable' reads
      tree
      grain
and file `mineral' reads
      +rock
      -dirt
then the above will be expanded to:
  filter armorial +man o+woman o+reptile o+amphibian o+tree o+grain +rock
      -dirt

  filter armorial +$&beastie +&&&doggie -$dragon

If file `&beastie' reads
      unicorn
      $dragon
      manticore
and file `&doggie' reads
      terrier
      hound
      $$paniel
and file `dragon' reads
      wyvern
      dragon
      lizard
then the above will be expanded to
  filter armorial +unicorn o+wyvern o+dragon o+lizard o+manticore +terrier
      +hound +$$paniel -wyvern -dragon -lizard
which will print lines from file `armorial' that contain:
      any of: `unicorn', `wyvern', `dragon', `lizard', `manticore'; and
      ALL of: `terrier', `hound', `$paniel'; and
      none of: `wyvern', `dragon', `lizard'.
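
The two expansions above can be reproduced with a short Python sketch. It is
an illustration under stated assumptions, not the program's own code. The
search-term files are modelled as a dictionary from file name to list of
lines, and all names are hypothetical. It assumes that a nested file named
inside a `$' or `&' file inherits the prefix of the term that named the outer
file, and that lines of a `^' file are used exactly as written. Files that
refer to each other in a loop are not guarded against.

```python
PREFIXES = "+-=_"
FILE_FLAGS = "$&^"


def expand(term, files):
    """Expand one term into a flat list of search terms."""
    link = "o" if len(term) > 1 and term[0] == "o" and term[1] in PREFIXES else ""
    body = term[len(link):]
    if not body or body[0] not in PREFIXES:
        return [term]
    prefix, rest = body[0], body[1:]
    flag = rest[:1]
    if not flag or flag not in FILE_FLAGS:
        return [term]
    count = len(rest) - len(rest.lstrip(flag))
    if count % 2 == 0:
        return [term]                      # doubled flags: literal text
    name = flag * ((count - 1) // 2) + rest[count:]
    lines = files[name]
    if flag == "^":
        raw = list(lines)                  # lines carry their own prefixes
    else:
        positive = prefix in "+="
        # `$' or-links + and =; `&' or-links - and _.
        or_linked = positive == (flag == "$")
        raw = []
        for i, line in enumerate(lines):
            lead = link if i == 0 else ("o" if or_linked else "")
            raw.append(lead + prefix + line)
    expanded = []
    for item in raw:
        expanded.extend(expand(item, files))   # nested files expand in turn
    return expanded


def expand_all(terms, files):
    """Expand every term of a command line, in order."""
    return [t for term in terms for t in expand(term, files)]


files = {
    "&beastie": ["unicorn", "$dragon", "manticore"],
    "&doggie": ["terrier", "hound", "$$paniel"],
    "dragon": ["wyvern", "dragon", "lizard"],
}
print(" ".join(expand_all(["+$&beastie", "+&&&doggie", "-$dragon"], files)))
```

The literal term `+$$paniel' is passed through with its doubling intact, as
in the expansion shown above.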
Revision history:
Version 1.0 September 1992.
1.1 Sep '92 fixed minor bugs
1.2 Sep '92 added 'or'-linking to keywords
1.4 Oct '92 fixed a minor error in string lengths, added size DEFINEs
1.5